ENCODER WITH ADAPTIVE RATE CONTROL



CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims the benefit of U.S. Provisional Application Serial No.
60/540,634 (Attorney Docket No. PU040035), filed January 30, 2004 and entitled
"ENCODER WITH ADAPTIVE RATE CONTROL", which is incorporated by reference
herein in its entirety.

FIELD OF THE INVENTION 
The present invention relates generally to video encoders and, more particularly,
to a video encoder with adaptive rate control.

BACKGROUND OF THE INVENTION 

Rate control is necessary in a Joint Video Team (JVT) video encoder to
achieve particular constant bitrates, when needed for fixed channel bandwidth
applications with limited buffer sizes. Avoiding buffer overflow and underflow is more
challenging on video content that includes sections with different complexity
characteristics, for example, sections with scene changes and dissolves.

Rate control has been studied for previous video compression standards.
TMN8 was proposed for H.263+. The TMN8 rate control uses a frame-layer rate
control to select the target number of bits for the current frame and a
macroblock-layer rate control to select the value of the quantization parameter (QP)
for the macroblocks.

In the frame-layer rate control, the target number of bits for the current frame is
determined by

$$B = R/F - \Delta, \tag{1}$$

$$\Delta = \begin{cases} W/F, & W > Z \cdot M \\ W - Z \cdot M, & \text{otherwise} \end{cases} \tag{2}$$

$$W = \max(W_{prev} + B' - R/F,\ 0) \tag{3}$$

where $B$ is the target number of bits for a frame, $R$ is the channel rate in bits per
second, $F$ is the frame rate in frames per second, $W$ is the number of bits in the
encoder buffer, $M$ is the maximum buffer size, $W_{prev}$ is the previous number of bits in
the buffer, $B'$ is the actual number of bits used for encoding the previous frame, and
$Z = 0.1$ is set by default to achieve the low delay.
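The frame-layer update of equations (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the TMN8 reference code: the function and argument names are hypothetical, and Z defaults to 0.1 as described above.

```python
def tmn8_frame_target(R, F, W, M, Z=0.1):
    """Target bits B for the next frame, per equations (1) and (2).

    R: channel rate (bits/s), F: frame rate (frames/s),
    W: current buffer level (bits), M: maximum buffer size (bits).
    """
    # Equation (2): drain the buffer gradually above the Z*M threshold,
    # otherwise steer the level back toward Z*M.
    delta = W / F if W > Z * M else W - Z * M
    # Equation (1).
    return R / F - delta


def update_buffer(W_prev, B_actual, R, F):
    """Buffer level after coding a frame with B_actual bits, per equation (3)."""
    return max(W_prev + B_actual - R / F, 0.0)
```

With R = 64000, F = 10 and M = 6400, an empty buffer yields a target slightly above the average R/F, while a level above Z*M yields a target below it.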

The macroblock-layer rate control selects the value of the quantization step
size for all the macroblocks in a frame, so that the sum of the macroblock bits is close
to the frame target number of bits $B$. The optimal quantization step size $Q_i^*$ for
macroblock $i$ in a frame can be determined by

$$Q_i^* = \sqrt{\frac{A K}{\beta_i - A N_i C} \cdot \frac{\sigma_i}{\alpha_i} \sum_{k=i}^{N} \alpha_k \sigma_k} \tag{4}$$

where $K$ is the model parameter, $A$ is the number of pixels in a macroblock, $N_i$ is the
number of macroblocks that remain to be encoded in the frame, $N$ is the number of
macroblocks in the frame, $\sigma_i$ is the standard deviation of the residue in the $i$th
macroblock, $\alpha_i$ is the distortion weight of the $i$th macroblock, $C$ is the overhead
rate, and $\beta_i$ is the number of bits left for encoding the frame by setting
$\beta_1 = B$ at the initialization stage.
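As an illustration of the macroblock-layer model, the sketch below evaluates the quantization step for one macroblock. It assumes the commonly published form of the TMN8 model, in which the step is the square root of A*K/(beta_i - A*N_i*C) times sigma_i/alpha_i times the sum of alpha_k*sigma_k over the macroblocks still to be coded. The function name and the 0-based indexing are hypothetical.

```python
import math


def tmn8_q_step(i, sigma, alpha, beta_i, A, K, C):
    """Quantization step for macroblock i (0-based) under the assumed TMN8 form.

    sigma, alpha: per-macroblock residue deviations and distortion weights.
    beta_i: bits left for the frame before coding macroblock i.
    A: pixels per macroblock, K: model parameter, C: overhead rate.
    """
    n_remaining = len(sigma) - i          # N_i, counting macroblock i itself
    budget = beta_i - A * n_remaining * C
    if budget <= 0:
        raise ValueError("no bits left after overhead; the model is undefined")
    # Weighted complexity of the macroblocks that remain to be coded.
    weighted = sum(a * s for a, s in zip(alpha[i:], sigma[i:]))
    return math.sqrt(A * K / budget * sigma[i] / alpha[i] * weighted)
```

For two macroblocks with sigma = 4, alpha = 1, A = 256, K = 0.5, C = 0 and 1024 bits left, the first macroblock receives a step of 2.0.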

The TMN8 scheme is simple and is known to be able to achieve both high
quality and an accurate bit rate, but is not well suited to H.264. Rate-distortion
optimization (RDO) (e.g., rate-constrained motion estimation and mode decision) is a
widely accepted approach in H.264 for mode decision and motion estimation, where
the quantization parameter (QP) (used to decide $\lambda$ in the Lagrangian optimization)
needs to be decided before RDO is performed. But the TMN8 model requires the
statistics of the prediction error signal (residue) to estimate the QP, which means that
motion estimation and mode decision need to be performed before the QP is
determined, thus resulting in a dilemma of which dependent parameter must be
calculated first, each value requiring knowledge about the other uncalculated value
on which to base the determination.

To overcome the dilemma mentioned above, a method (hereinafter the "first
conventional method") proposed for H.264 rate control and incorporated into the JVT
JM reference software release JM7.4 uses the residue of the collocated macroblock
in the most recently coded picture with the same type to predict that of the current
macroblock. Moreover, to also overcome the dilemma, another method (hereinafter
the "second conventional method") proposed for H.264 rate control employs a
two-step encoding, where the QP of the previous picture ($QP_{prev}$) is first used to generate
the residue, and then the QP of the current macroblock is estimated based on the
residue. The former approach (i.e., the first conventional method) is simple, but it
lacks precision. The latter approach (i.e., the second conventional method) is more
accurate, but it requires multiple encoding, thus adding much complexity.


SUMMARY OF THE INVENTION 
These and other drawbacks and disadvantages of the prior art are addressed 
by the present invention, which is directed to an encoder with adaptive rate control. 
According to an aspect of the present invention, there is provided a video
encoder for encoding image frames that are divisible into macroblocks. The video
encoder includes means for generating a quantization parameter (QP) estimate for
the macroblocks of an image frame. The video encoder further includes means for
selection of a frame level QP for the image frame, using one of mean, median, and
mode of QP estimates for the macroblocks.
According to another aspect of the present invention, there is provided a
method for encoding image frames that are divisible into macroblocks. The method
includes the step of generating a quantization parameter estimate for the
macroblocks of an image frame. The method further includes the step of selecting a
frame level QP for the image frame, using one of mean, median, and mode of QP
estimates for the macroblocks.

These and other aspects, features and advantages of the present invention will 
become apparent from the following detailed description of exemplary embodiments, 
which is to be read in connection with the accompanying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention may be better understood in accordance with the 
following exemplary figures, in which: 

FIG. 1 shows a block diagram for a video encoder; and 

FIG. 2 shows a flowchart for an encoding process with rate control in
accordance with the principles of the present invention.



DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 
The present invention is directed to an encoder with adaptive rate control. 
Advantageously, the present invention avoids buffer overflow and underflow in a 
video encoder, particularly in the case of video content that includes sections with
different complexity characteristics. 

The present description illustrates the principles of the present invention. It will
thus be appreciated that those skilled in the art will be able to devise various
arrangements that, although not explicitly described or shown herein, embody the
principles of the invention and are included within its spirit and scope.

All examples and conditional language recited herein are intended for
pedagogical purposes to aid the reader in understanding the principles of the
invention and the concepts contributed by the inventor to furthering the art, and are to
be construed as being without limitation to such specifically recited examples and
conditions.

Moreover, all statements herein reciting principles, aspects, and embodiments
of the invention, as well as specific examples thereof, are intended to encompass
both structural and functional equivalents thereof. Additionally, it is intended that
such equivalents include both currently known equivalents as well as equivalents
developed in the future, i.e., any elements developed that perform the same function,
regardless of structure.

Thus, for example, it will be appreciated by those skilled in the art that the
block diagrams presented herein represent conceptual views of illustrative circuitry
embodying the principles of the invention. Similarly, it will be appreciated that any
flow charts, flow diagrams, state transition diagrams, pseudocode, and the like
represent various processes which may be substantially represented in computer
readable media and so executed by a computer or processor, whether or not such
computer or processor is explicitly shown.

The functions of the various elements shown in the figures may be provided
through the use of dedicated hardware as well as hardware capable of executing
software in association with appropriate software. When provided by a processor,
the functions may be provided by a single dedicated processor, by a single shared
processor, or by a plurality of individual processors, some of which may be shared.

Moreover, explicit use of the term "processor" or "controller" should not be construed
to refer exclusively to hardware capable of executing software, and may implicitly
include, without limitation, digital signal processor ("DSP") hardware, read-only
memory ("ROM") for storing software, random access memory ("RAM"), and
non-volatile storage.

Other hardware, conventional and/or custom, may also be included. Similarly,
any switches shown in the figures are conceptual only. Their function may be carried
out through the operation of program logic, through dedicated logic, through the
interaction of program control and dedicated logic, or even manually, the particular
technique being selectable by the implementer as more specifically understood from
the context.

In the claims hereof, any element expressed as a means for performing a
specified function is intended to encompass any way of performing that function
including, for example, a) a combination of circuit elements that performs that
function or b) software in any form, including, therefore, firmware, microcode or the
like, combined with appropriate circuitry for executing that software to perform the
function. The invention as defined by such claims resides in the fact that the
functionalities provided by the various recited means are combined and brought
together in the manner which the claims call for. Applicants thus regard any means
that can provide those functionalities as equivalent to those shown herein.

In FIG. 1, a video encoder is shown with an encoder 100 input connected in
signal communication with a non-inverting input of a summing junction 110. The
output of the summing junction 110 is connected in signal communication with a
block transformer 120. The transformer 120 is connected in signal communication
with a first input of a quantizer 130. The output of the quantizer 130 is connected in
signal communication with a variable length coder ("VLC") 140, where the output of
the VLC 140 is an externally available output of the encoder 100. A first input of a
rate controller 177 is connected in signal communication with the output of the
summing junction 110, a second input of the rate controller 177 is connected in signal
communication with the output of the VLC 140, and an output of the rate controller
177 is connected in signal communication with a second input of the quantizer 130.

The output of the quantizer 130 is further connected in signal communication
with an inverse quantizer 150. The inverse quantizer 150 is connected in signal
communication with an inverse block transformer 160, which, in turn, is connected in
signal communication with a reference picture store 170. A first output of the
reference picture store 170 is connected in signal communication with a first input of
a motion estimator 180. The input to the encoder 100 is further connected in signal
communication with a second input of the motion estimator 180. The output of the
motion estimator 180 is connected in signal communication with a first input of a
motion compensator 190. A second output of the reference picture store 170 is
connected in signal communication with a second input of the motion compensator
190. The output of the motion compensator 190 is connected in signal
communication with an inverting input of the summing junction 110.

Turning now to FIG. 2, an exemplary process for encoding image blocks with
rate control is generally indicated by the reference numeral 200. The process
includes an initialization block 205 which initializes a buffer, calculates the average
target frame bits or average target Group Of Pictures (GOP) bits, sets the initial value
of all the rate control related parameters, and so forth. The initialization block 205
passes control to a loop limit block 210 which begins a first loop and sets i=0 (range 0
to frame_number-1), and passes control to a decision block 215.

In decision block 215, it is determined (for the current frame) whether the
buffer fullness (buffer_fullness) is greater than a first threshold T1 and whether the
available bits to code the frame (bit_budget) are less than a second threshold T2.

If buffer_fullness is greater than T1 and/or bit_budget is less than T2, then
control passes to function block 220, which performs virtual frame skipping, and
passes control to an end loop block 275 for next frame (i<frame_number) or ends the
first loop (i==frame_number). Otherwise, if buffer_fullness is less than or equal to T1
and bit_budget is greater than or equal to T2, then control passes to a function block
225.
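The test made in decision block 215 can be written as a single predicate. The sketch below is illustrative only; the function name is hypothetical, while buffer_fullness, bit_budget, T1 and T2 follow the description above.

```python
def should_skip_frame(buffer_fullness, bit_budget, t1, t2):
    """Decision block 215: True routes the frame to virtual skipping (block 220).

    The frame is skipped when the buffer is fuller than T1 or when fewer
    than T2 bits are available to code it; otherwise it goes on to
    pre-processing (block 225).
    """
    return buffer_fullness > t1 or bit_budget < t2
```

A frame whose values sit exactly on both thresholds is not skipped, matching the "less than or equal to T1" and "greater than or equal to T2" branch.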

The function block 225 performs a pre-processing of the frame depending on
the picture type to obtain an estimation of the prediction residual, and passes control
to a function block 230. The pre-processing performed by function block 225 for I
pictures may include intra coding using a subset of allowable intra prediction modes
to form predictions, and may use mean square error with respect to prediction
residuals of the predictions formed using the subset to determine a best mode from
among the subset of allowable intra prediction modes. The pre-processing
performed by function block 225 for P pictures may include performing motion
estimation with only the 16x16 block type and 1 reference picture. It is to be
appreciated that as used herein, the phrase "best mode" refers to a prediction mode
that results in the most accurate prediction for a given frame and/or image block.

The function block 230 performs frame quantization parameter (QP) estimation
based on statistics generated by the pre-processing block 225, and passes control to
a loop limit block 235. The loop limit block 235 begins a second loop, sets j=0 (range
0 to MB_number-1), and passes control to a decision block 240.

The decision block 240 determines whether macroblock-level rate control is
allowed. If macroblock-level rate control is not allowed, then control is passed to a
function block 260, which codes every macroblock of the frame with the frame QP,
and passes control to an end loop block 265 that ends the second loop. If
macroblock-level rate control is allowed, then control passes to a function block 245,
which estimates a QP for each macroblock according to the RD (rate-distortion)
model and frame QP, and passes control to a function block 250.

Function block 250 encodes a current macroblock, and passes control to a
function block 255. Function block 255, which is performed after one macroblock is
coded, updates the RD model along with other statistics, and passes control to end
loop block 265.

End loop block 265 passes control to a function block 270, which updates the
buffer fullness and other statistics (e.g., the target bits for next frame and the
parameters in the RD model) when a frame coding is finished, and passes control to
the end loop block 275, which passes control to an end block 280 after all the frames
are coded.

A description will now be given of some of the many issues addressed by the
present invention in providing adaptive rate control for encoding video data. The
present invention builds upon the model used in TMN8 of H.263+. This model uses
Lagrangian optimization to minimize distortion subject to the target bitrate constraint.
To adapt the model into the International Telecommunication Union,
Telecommunication Sector (ITU-T) H.264 standard and to further improve the
performance, several issues have to be considered. First, rate-distortion optimization
(RDO) (e.g., rate-constrained motion estimation and mode decision) is a widely
accepted approach in H.264 for mode decision and motion estimation, where the
quantization parameter (QP) (used to decide $\lambda$ in the Lagrangian optimization) needs
to be decided before RDO is performed. But the TMN8 model requires the statistics
of the prediction error signal (residue) to estimate the QP, which means that motion
estimation and mode decision need to be performed before the QP is determined, thus
resulting in a dilemma of which dependent parameter must be calculated first, each
value requiring knowledge about the other uncalculated value on which to base the
determination.

Second, TMN8 is targeted at low delay applications, but H.264 can be used for
various applications. Therefore a new bit allocation and buffer management scheme
is needed for various content. Third, TMN8 adapts the QP at the macroblock level.
Although a constraint is made on the QP difference (DQUANT) between the current
macroblock and the last coded macroblock, large QP variations within the same
picture can be observed and have a negative subjective effect. In
addition, it is known that using a constant QP for the whole image may save
additional bits for coding DQUANT, thus achieving higher PSNR for very low bit rate.
Finally, H.264 uses a 4x4 integer transform, and if the codec uses some thresholding
techniques such as in the JM reference software, details may be lost. Therefore, it is
useful to adopt the perceptual model in the rate control to maintain the details.



Preprocessing Stage

From equation (4), it can be seen that the TMN8 model requires the knowledge
of the standard deviation of the residue to estimate QP. However, RDO requires
knowledge of the QP to perform motion estimation and mode decision to thus
produce the residue. To overcome this dilemma, the first conventional method
mentioned above uses the residue of the collocated macroblock in the most recently
coded picture with the same type to predict that of the current macroblock, and the
second conventional method mentioned above employs a two-step encoding, where
the QP of the previous picture ($QP_{prev}$) is first used to generate the residue, and then
the QP of the current macroblock is estimated based on the residue. The former
approach (i.e., the first conventional method) is simple, but it lacks precision. The
latter approach (i.e., the second conventional method) is more accurate, but it
requires multiple encoding, thus adding too much complexity.

According to the present invention, a different approach is adopted to estimate
the residue, which is simpler than the second conventional method mentioned above,
but more accurate than the first conventional method mentioned above. Experiments
show that a simple preprocessing stage can give a good estimation of the residue.
For an I picture, only the 3 most probable intra16x16 modes (vertical, horizontal and
DC mode) are tested and the MSE (Mean Square Error) of the
prediction residual is used to select the best mode. Only three modes are tested in
order to reduce complexity. However, in other embodiments of the present invention,
more or fewer modes can be tested while maintaining the spirit of the present
invention. The spatial residue is then generated using the best mode. It should be
noted that the original pixel values are used for intra prediction instead of
reconstructed pixels, simply because the reconstructed pixels are not available.
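The I-picture preprocessing step can be sketched as follows: form the vertical, horizontal and DC predictions from original neighboring pixels, keep the mode with the lowest MSE, and return the resulting spatial residue. This is an illustrative sketch, not the encoder's implementation. The function names are hypothetical, the block size is taken from the input so that a small block can stand in for a 16x16 macroblock, and the DC value uses the rounded mean of the top and left neighbors.

```python
def predict(mode, top, left, size):
    """Build a size x size prediction from the original neighboring pixels."""
    if mode == "vertical":
        return [list(top) for _ in range(size)]
    if mode == "horizontal":
        return [[left[r]] * size for r in range(size)]
    if mode == "dc":
        dc = (sum(top) + sum(left) + size) // (2 * size)  # rounded mean
        return [[dc] * size for _ in range(size)]
    raise ValueError(f"unknown mode: {mode}")


def mse(block, pred):
    """Mean square error between a block and its prediction."""
    n = len(block) * len(block[0])
    return sum((b - p) ** 2
               for row_b, row_p in zip(block, pred)
               for b, p in zip(row_b, row_p)) / n


def best_intra_mode(block, top, left, modes=("vertical", "horizontal", "dc")):
    """Return (best mode, its MSE, spatial residue) over the tested modes."""
    size = len(block)
    scored = [(mse(block, predict(m, top, left, size)), m) for m in modes]
    best_mse, best_mode = min(scored)
    pred = predict(best_mode, top, left, size)
    residue = [[b - p for b, p in zip(row_b, row_p)]
               for row_b, row_p in zip(block, pred)]
    return best_mode, best_mse, residue
```

The standard deviation of the returned residue is what the rate model would then consume; testing more or fewer modes only changes the modes tuple.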
For P pictures, a rate-constrained motion search is performed using only the
16x16 block type and 1 reference picture. The temporal residue is generated using
the best motion vector in this mode. The average QP of the previously coded picture
is used to decide $\lambda$ for the rate-constrained motion search. Experiments show that
by constraining the difference of the QP between the previously coded picture and the
current picture, the $\lambda$ based on $QP_{prev}$ has a minor impact on motion estimation. A
side advantage of this approach is that the resultant motion vectors in the
preprocessing step can be used as initial motion vectors in the motion estimation
during the encoding.


Frame-layer rate control 

TMN8 is targeted to low-delay and low bit rate applications, which are assumed
to encode only P pictures after the first I picture; hence, the bit allocation model as
shown in equation (1) should be redefined to adapt to the various applications which
use more frequent I pictures. The QP estimation model by equation (4) can result in
large QP variation within one image; thus, a frame-level QP is better estimated first to
place a constraint on the variation of the macroblock (MB) QP. In addition, for very
low bit rate, due to the overhead of coding the DQUANT, it may be more efficient to
use a constant picture QP. Thus, a good rate control scheme should allow rate
control at both the frame-level and the MB-level.

A description will first be provided of a new bit allocation scheme in 
accordance with the principles of the present invention. Then, a description will be 
provided of a simple scheme to decide a frame-level QP in accordance with the 
principles of the present invention.

In many applications, e.g., in real-time encoders, the encoder does not know
the total number of frames that need to be coded beforehand, or when scene
changes will occur. Thus, a Group of Pictures (GOP) layer rate control is adopted to
allocate target bits for each picture. The H.264 standard does not actually include
Group of Pictures, but the terminology is used here to represent the distance
between I pictures. The length of the GOP is indicated by $N_{GOP}$. If $N_{GOP} \to \infty$, then
the following is set: $N_{GOP} = F$, which corresponds to one second's length of frames.
Notation $BG_{i,j}$ is used to indicate the remaining bits in the GOP $i$ after coding picture
$j-1$, equal to

$$BG_{i,j} = \begin{cases} \min(RG_{i-1} + R/F \cdot N_{GOP},\ R/F \cdot N_{GOP} + M \cdot 0.2), & j = 0 \\ BG_{i,j-1} - B'_{i,j-1}, & \text{otherwise} \end{cases} \tag{5}$$



In the above equation, $RG_{i-1}$ is the number of remaining bits after GOP $i-1$ is
coded, given by $RG_{i-1} = R/F \cdot N_{coded} - B_{coded}$, where $B_{coded}$ is the used bits and
$N_{coded}$ is the number of coded pictures after GOP $i-1$ is finished. $B_{i,j}$ and $B'_{i,j}$ are
the target bits and actual used bits for frame $j$ of GOP $i$, respectively. In equation (5),
one constraint is added on the total number of bits allocated for the GOP $i$ to prevent
buffer overflow when the complexity level of the content varies dramatically from one
GOP to another. For example, consider a scenario where the previous GOP was of very low
complexity, e.g., all black, so the buffer fullness level would go quite low. Instead of
allocating all of the unused bits from the previous GOP to the current GOP, the
unused bits are distributed over several following GOPs by not allowing more than
$0.2M$ additional bits to an individual GOP. The target frame bit $B_{i,j}$ is then allocated
according to picture type. If the $j$th picture is P, then the target bits is
$B^P_{i,j} = BG_{i,j}/(K^I N^I + N^P)$, where $K^I$ is the bit ratio between I pictures and P pictures,
which can be estimated using a sliding window approach, $N^I$ is the remaining
number of I pictures in GOP $i$ and $N^P$ is that of P pictures; otherwise, $B^I_{i,j} = K^I B^P_{i,j}$.
Since P pictures are used as the references by subsequent P pictures in the same
GOP, more target bits are allocated for P pictures that are at the beginning of the
GOP to ensure the later P pictures can be predicted from the references of better
quality and the coding quality can be improved. A linear weighted P picture target bit
allocation is used as follows:

$$B^P_{i,j} \mathrel{+}= R/F \cdot 0.2 \cdot (N_{GOP} - 2j)/(N_{GOP} - 2) \tag{6}$$
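The GOP budget and the per-picture targets can be illustrated with the sketch below. It is a minimal reading of the description above, not the encoder's code. It assumes that the initial budget is the nominal GOP allotment plus carried-over bits capped at 0.2M, that a P picture receives BG/(K_I*N_I + N_P) bits with an I picture receiving K_I times that amount, and that equation (6) is an additive adjustment to the P-picture target. All function and argument names are hypothetical.

```python
def gop_initial_budget(rg_prev, R, F, n_gop, M):
    """Bits available to a new GOP (the j = 0 case of equation (5)).

    rg_prev: bits left over after the previous GOP; at most 0.2*M of them
    are carried into this GOP.
    """
    nominal = R / F * n_gop
    return min(rg_prev + nominal, nominal + 0.2 * M)


def frame_target_bits(bg, k_i, n_i, n_p, is_intra):
    """Split the remaining GOP bits bg by picture type.

    k_i: bit ratio between I and P pictures; n_i, n_p: remaining I and P
    pictures in the GOP.
    """
    b_p = bg / (k_i * n_i + n_p)
    return k_i * b_p if is_intra else b_p


def p_linear_weight(R, F, n_gop, j):
    """Additive adjustment of equation (6) for the j-th picture of the GOP."""
    return R / F * 0.2 * (n_gop - 2 * j) / (n_gop - 2)
```

With R/F = 10000 and a 30-picture GOP, the adjustment runs linearly from +2000 bits at j = 1 to -2000 bits at j = 29, so early P pictures receive more bits than late ones.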

Another constraint is added to better meet the target bits for a GOP as

$$B_{i,j} \mathrel{+}= 0.1 \cdot B^{diff}_{i,j-1}$$

where $B^{diff}_{i,j-1} = B_{i,j-1} - B'_{i,j-1}$, and $B^{diff}_{i,j-1} = \mathrm{sign}(B^{diff}_{i,j-1}) \min(|B^{diff}_{i,j-1}|,\ R/F)$.
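The correction above feeds back one tenth of the previous picture's target miss, with the miss clipped to one average frame budget R/F. A minimal sketch, with hypothetical names:

```python
import math


def carry_over_adjustment(b_target_prev, b_actual_prev, R, F):
    """Bits added to the current target from the previous picture's miss.

    The difference between target and actual bits of the previous picture
    is clipped in magnitude to R/F, keeps its sign, and is scaled by 0.1.
    """
    diff = b_target_prev - b_actual_prev
    clipped = math.copysign(min(abs(diff), R / F), diff)
    return 0.1 * clipped
```

A picture that undershot its target by 2000 bits adds 200 bits to the next target; one that overshot by 30000 bits with R/F = 10000 removes only 1000 bits because of the clipping.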
As an exemplary rate control according to the present invention, a 50% buffer
occupancy is sought. To prevent buffer overflow or underflow, the target bits need to
be jointly adapted with the buffer level. The buffer level $W$ is updated at the end of
coding each picture by equation (3). According to the principles of the present
invention, instead of using the real buffer level to adjust the target bits, a virtual buffer
level $W'$ given by $W' = \max(W, 0.4M)$ is proposed. This helps prevent the following
scenario: if the previously coded pictures are of very low complexity, such as black
scenes, and consume very few bits, then the buffer level will become very low. If the
real buffer level is used to adjust target frame bits as in equation (7), too many bits may be
allocated, which will cause the QP to decrease very quickly. After a while, when the
scene returns to normal, the low QP will easily cause the buffer to overflow. Hence, it
becomes necessary to either increase QP dramatically or skip frames. This causes the
temporal quality to vary significantly. Then, the bits are adjusted by buffer control as
follows:

$$B_{i,j} = B_{i,j} \cdot (2M - W')/(M + W') \tag{7}$$
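The buffer-based scaling of equation (7) can be sketched as below. The sketch assumes the same virtual level W' appears in both the numerator and the denominator, which makes the scale factor exactly 1 at the sought 50% occupancy. The function name is hypothetical.

```python
def buffer_adjusted_target(b, W, M):
    """Scale the target bits b by the virtual buffer level, per equation (7).

    The virtual level never drops below 0.4*M, so a nearly empty buffer
    cannot inflate the target by more than a factor of 1.6/1.4.
    """
    w_virtual = max(W, 0.4 * M)
    return b * (2 * M - w_virtual) / (M + w_virtual)
```

With M = 100000 and b = 10000, a half-full buffer leaves the target unchanged, a full buffer halves it, and any level at or below 40000 gives the same capped increase.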



To guarantee a minimum level of quality, the following is set:
$B_{i,j} = \max(0.6 \cdot R/F,\ B_{i,j})$. To further avoid the buffer overflow and underflow, the
buffer safety top margin $W_T$ and bottom margin $W_B$ for an I picture are set as
$W_T^I = 0.75M$ and $W_B^I = 0.25M$. As for P pictures, compliant with equation (5) and to
allow enough buffer for the next I picture in the next GOP, the following is set:
$W_T^P = (1 - ((0.4 - 0.2)/(N - 1) \cdot j + 0.2)) \cdot M$ and $W_B^P = 0.1M$. The final target bits are
determined as follows. The following is set: $W_{VT} = W + B_{i,j}$, $W_{VB} = W_{VT} - R/F$. If
$W_{VT} > W_T$, then $B_{i,j} \mathrel{-}= W_{VT} - W_T$; else if $W_{VB} < W_B$, then $B_{i,j} \mathrel{+}= W_B - W_{VB}$.
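The final adjustment of the target can be sketched as follows. This is an illustration under stated assumptions: the top-margin test is read as "projected level above the top margin", since only then does subtracting the excess reduce the target, and n in p_margins is taken to be the number of pictures in the GOP. All names are hypothetical.

```python
def i_margins(M):
    """Top and bottom safety margins for an I picture."""
    return 0.75 * M, 0.25 * M


def p_margins(j, n, M):
    """Top and bottom safety margins for the j-th P picture.

    The top margin falls linearly from 0.8*M at j = 0 to 0.6*M at j = n - 1,
    leaving room for the next I picture.
    """
    top = (1 - ((0.4 - 0.2) / (n - 1) * j + 0.2)) * M
    return top, 0.1 * M


def finalize_target(b, W, R, F, top, bottom):
    """Apply the quality floor, then keep the projected buffer inside the margins."""
    b = max(0.6 * R / F, b)       # minimum level of quality
    w_vt = W + b                  # buffer level right after the picture
    w_vb = w_vt - R / F           # buffer level after one frame interval
    if w_vt > top:
        b -= w_vt - top           # would exceed the top margin: remove the excess
    elif w_vb < bottom:
        b += bottom - w_vb        # would fall below the bottom margin: add bits
    return b
```

The margin correction runs after the quality floor, so a nearly full buffer can still push the final target below 0.6 R/F.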

It is to be noted that if a scene change detector is employed, the picture at the
scene change is encoded to be an I picture and a new GOP starts from this I picture.
The above scheme can still be employed.

A new approach in accordance with the principles of the present invention is
proposed to decide the frame-level QP based on the macroblock-level QPs found in
equation (4). Equation (4) is modified so that $C$ is the overhead from the last coded
picture with the same type and $\sigma_i$ is estimated in the preprocessing stage as
described above. Two approaches in
accordance with the principles of the present invention can be used to get the frame-level
constant QP, denoted as $QP_f$. The first approach is to set $\alpha_i = \sigma_i$, so that all the MB
QPs are equal. The second approach is to use the same $\alpha_i$ as that of the MB level,
as defined hereinafter, then use the mean, median or mode of the histogram of the
$Q_i$ values to find the $QP_f$.

In a preferred embodiment of the present invention, the second approach to
get $QP_f$ is used to better match the MB QP. The frame-level quantization step size is
decided by the mean of the $Q_i$ values, $Q_f = \sum_{i=1}^{N} Q_i / N$. It is noted that there is a
conversion between the quantization parameter QP and quantization step size $Q$ by
$Q = 2^{(QP-6)/6}$. To reduce the temporal quality variation between adjacent pictures,
the following is set: $QP_f = \max(QP_f' - D_f,\ \min(QP_f,\ QP_f' + D_f))$, where $QP_f'$ is the frame QP
of the last coded frame, and $D_f$ is 4 when $W \ge 0.7M$ and a smaller value when
$W < 0.7M$. Since scene changes usually cause
higher buffer levels, advantage is taken of the temporal masking effect and $D_f$ is set to be
a higher value when a scene change occurs.
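The frame-level QP selection can be sketched as below: average the macroblock step sizes, convert the mean to a QP with Q = 2^((QP-6)/6), and limit the change from the previous frame QP to D_f. This is a sketch under stated assumptions: rounding to the nearest integer QP is an added assumption, the smaller bound used below the 0.7M buffer threshold is passed in by the caller as the hypothetical parameter d_low, and all function names are hypothetical.

```python
import math


def qp_to_qstep(qp):
    """Quantization step size for a QP, using Q = 2^((QP - 6) / 6)."""
    return 2 ** ((qp - 6) / 6)


def qstep_to_qp(q):
    """Inverse of qp_to_qstep (not rounded)."""
    return 6 * math.log2(q) + 6


def max_qp_change(W, M, d_low, d_high=4, threshold=0.7):
    """D_f: the larger bound applies once the buffer reaches threshold * M."""
    return d_low if W < threshold * M else d_high


def frame_qp(q_steps, qp_prev, d_f):
    """Frame QP from the mean macroblock step, clamped to qp_prev +/- d_f."""
    q_f = sum(q_steps) / len(q_steps)
    qp = round(qstep_to_qp(q_f))          # assumption: nearest integer QP
    return max(qp_prev - d_f, min(qp, qp_prev + d_f))
```

Replacing the mean with statistics.median or statistics.mode of the step sizes gives the other two variants mentioned above.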



MB-layer rate control

A first key feature in MB-layer rate control pertains to the adaptive selection of
weighted distortion $\alpha_i$ to get a better perceptual quality. A second key feature is to
reduce the variation of the MB QPs in the same picture.

For low detail content, such as an ocean wave, a lower QP is required to keep
the details. However, from an RDO point of view, a higher QP is preferred because
the lower detail content tends to give a higher PSNR. To keep a balance, different
settings of $\alpha_i$ are adopted for I and P pictures, respectively. For an I picture, a higher
distortion weight is given to the MBs with less detail, so that the detail can be better
retained. Accordingly, the following is set:
$\alpha_i = (\sigma_i + 2\sigma_{avg})/(2\sigma_i + \sigma_{avg})$, where $\sigma_{avg} = \sum_{i=1}^{N} \sigma_i / N$.



For a P picture, a higher distortion weight is given to the MBs with more
residue errors. Accordingly, 



In this way, better perceptual quality is maintained for an I picture and can be
propagated to the following P pictures, while higher objective quality is still
maintained. To prevent large variation of the quality inside one picture, the following
is set: $QP_i = \max(QP_f - 2,\ \min(QP_i,\ QP_f + 2))$. If a frame level rate control is used, then
every macroblock is coded with the frame QP, i.e., $QP_i = QP_f$.
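The macroblock QP limit can be sketched as a simple clamp around the frame QP. The sketch is illustrative: the function name and the mb_level_rc flag are hypothetical, and the fallback to the frame QP when macroblock-level control is off follows the description of function block 260, which codes every macroblock with the frame QP.

```python
def clamp_mb_qp(qp_mb, qp_frame, max_delta=2, mb_level_rc=True):
    """Keep a macroblock QP within max_delta of the frame QP.

    With macroblock-level rate control disabled, the frame QP is used as is.
    """
    if not mb_level_rc:
        return qp_frame
    return max(qp_frame - max_delta, min(qp_mb, qp_frame + max_delta))
```

For a frame QP of 26, macroblock estimates of 30, 20 and 27 become 28, 24 and 27.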

Virtual frame skipping 

After encoding one picture, $W$ is updated by equation (3). If $W > 0.9M$, the
next frame is virtually skipped until the buffer level is below $0.9M$. Virtual frame
skipping codes every MB in the P picture in SKIP mode. In this way, a
constant frame rate can be syntactically maintained. If the current frame is
determined to be a virtual skipped frame, then the following is set: $QP_f = QP_f' + 2$.
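The skip rule can be sketched as a per-frame decision made after the buffer update. The function name and the tuple return shape are hypothetical; the 0.9M threshold and the QP increment of 2 are taken from the paragraph above.

```python
def virtual_skip_decision(W, M, qp_prev):
    """Decide whether the next frame is virtually skipped.

    Returns (True, frame QP to record) when the buffer level exceeds 0.9*M,
    in which case every macroblock is coded in SKIP mode and the frame QP
    is set to the previous frame QP plus 2. Returns (False, None) otherwise.
    """
    if W > 0.9 * M:
        return True, qp_prev + 2
    return False, None
```

Calling this after each buffer update skips frames one at a time until the level falls to 0.9M or below.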

In summary, the rate control scheme according to the present invention
includes the following steps: preprocessing, frame target bits allocation and
frame-level constant QP estimation, MB-level QP estimation, buffer updates and virtual
frame skipping control. Advantageously, the present invention can allow both
frame-level and MB-level rate control.

A description will now be given of some of the many attendant
advantages/features of the present invention, according to various illustrative
embodiments of the present invention. For example, one advantage/feature is the
use of mean/median/mode of initial macroblock QP estimates to select frame level
QP. Another advantage/feature is when the selected frame level QP is used in the
calculation of the individual macroblock QPs. Yet another advantage/feature is when
performing intra prediction, using a subset of the allowable intra-prediction modes to
form the residue that is used in the QP selection process. Moreover, another
advantage/feature is the use of a small number of intra-prediction modes (three (3),
for example). Also, another advantage/feature is when a previous GOP was coded
with a large number of unused bits, limiting the additional bits allocated to the current
GOP to a predetermined threshold. Still another advantage/feature is when a virtual
buffer level instead of an actual buffer level is used for buffer control.

These and other features and advantages of the present invention may be
readily ascertained by one of ordinary skill in the pertinent art based on the teachings
herein. It is to be understood that the teachings of the present invention may be
implemented in various forms of hardware, software, firmware, special purpose
processors, or combinations thereof.

Most preferably, the teachings of the present invention are implemented as a
combination of hardware and software. Moreover, the software is preferably
implemented as an application program tangibly embodied on a program storage
unit. The application program may be uploaded to, and executed by, a machine
comprising any suitable architecture. Preferably, the machine is implemented on a
computer platform having hardware such as one or more central processing units
("CPU"), a random access memory ("RAM"), and input/output ("I/O") interfaces. The
computer platform may also include an operating system and microinstruction code.
The various processes and functions described herein may be either part of the
microinstruction code or part of the application program, or any combination thereof,
which may be executed by a CPU. In addition, various other peripheral units may be
connected to the computer platform such as an additional data storage unit and a
printing unit.

It is to be further understood that, because some of the constituent system
components and methods depicted in the accompanying drawings are preferably
implemented in software, the actual connections between the system components or
the process function blocks may differ depending upon the manner in which the
present invention is programmed. Given the teachings herein, one of ordinary skill in
the pertinent art will be able to contemplate these and similar implementations or
configurations of the present invention.

Although the illustrative embodiments have been described herein with
reference to the accompanying drawings, it is to be understood that the present
invention is not limited to those precise embodiments, and that various changes and
modifications may be effected therein by one of ordinary skill in the pertinent art
without departing from the scope or spirit of the present invention. All such changes
and modifications are intended to be included within the scope of the present
invention as set forth in the appended claims.



